fix: deduplicate concurrent refresh requests#386
Conversation
Test Coverage Report (Linux)
Coverage increased! Great work! |
Performance Report (Linux) ✅
Legend
|
Performance Report (macOS)
Legend
|
Performance Report (Windows) ➖
Legend
|
Test Coverage Report (Windows)
Coverage increased! Great work! |
This follow-up to #383 turns the implicit post-refresh state rules into an explicit locator contract so transient refresh graphs do not silently regress later resolve, find, or telemetry behavior. - add `RefreshStatePersistence` and scope-aware `sync_refresh_state_from()` to the core `Locator` contract - classify config-only, self-hydrating, and synchronized locators across Conda, Poetry, PyEnv, PipEnv, Uv, Linux global Python, Windows Registry, and Windows Store - make refresh-state handoff generic in `jsonrpc.rs`, including partial-refresh handling and workspace-scope Poetry merges - align config-carrying locators with replace semantics so cleared configure inputs do not leave stale executables or workspace directories behind - add unit coverage for Conda partial-refresh behavior, Poetry workspace merges and configure clearing, PyEnv self-hydration, and Windows Registry/Store sync semantics Fixes #387
There was a problem hiding this comment.
Pull request overview
This PR updates PET’s JSONRPC refresh handling to deduplicate concurrent compatible refresh requests (single-flight) while keeping locator state correct across refresh/configure boundaries by running refreshes on transient locator graphs and syncing back required caches.
Changes:
- Replace the global refresh mutex with a generation-aware refresh coordinator that joins compatible in-flight requests.
- Introduce locator refresh-state persistence/sync contracts and sync correctness-critical caches from transient refresh graphs back into shared locators.
- Add unit + integration test coverage, including a reusable JSONRPC test client and concurrent refresh server tests.
Show a summary per file
| File | Description |
|---|---|
| crates/pet/src/jsonrpc.rs | Implements generation-aware single-flight refresh coordination, refresh option canonicalization, transient locator graphs, and state sync back to shared locators. |
| crates/pet-core/src/lib.rs | Adds RefreshStatePersistence / RefreshStateSyncScope and locator sync hooks via sync_refresh_state_from. |
| crates/pet-conda/src/lib.rs | Implements synced refresh-state behavior for Conda discovery caches. |
| crates/pet-poetry/src/lib.rs | Implements synced refresh-state behavior and workspace-scope merge for Poetry discovery caches; adjusts configure/clear behavior. |
| crates/pet-windows-store/src/lib.rs | Implements refresh-state syncing for Windows Store locator cache. |
| crates/pet-windows-registry/src/lib.rs | Implements refresh-state syncing for Windows Registry locator cache. |
| crates/pet-pipenv/src/lib.rs | Marks PipEnv locator state as ConfiguredOnly and makes configure overwrite/clear configured executable consistently. |
| crates/pet-uv/src/lib.rs | Marks Uv locator state as ConfiguredOnly and ensures configure clears stale workspace directories. |
| crates/pet-linux-global-python/src/lib.rs | Marks Linux global python locator state as SelfHydratingCache. |
| crates/pet-pyenv/src/lib.rs | Marks PyEnv locator state as SelfHydratingCache. |
| crates/pet-pyenv/tests/pyenv_test.rs | Adds coverage asserting self-hydrating behavior without sync. |
| crates/pet/tests/jsonrpc_client.rs | Adds a reusable integration-test JSONRPC client for spawning and interacting with the server. |
| crates/pet/tests/jsonrpc_server_test.rs | Adds server integration tests for configure/refresh behavior and concurrent refresh deduplication. |
Copilot's findings
- Files reviewed: 13/13 changed files
- Comments generated: 1
There was a problem hiding this comment.
Pull request overview
This PR improves PET’s JSONRPC refresh behavior by deduplicating concurrent compatible refresh requests into a single physical refresh, while keeping locator state correct by running refreshes on transient locator graphs and syncing correctness-critical caches back to the shared locators.
Changes:
- Replace the global refresh mutex with a generation-aware single-flight refresh coordinator (join/wait semantics).
- Add refresh parameter normalization/canonicalization so compatible
searchPathsrequests dedupe reliably. - Add unit + integration tests (including a reusable JSONRPC test client) for concurrent refresh deduplication and refresh-state syncing.
Show a summary per file
| File | Description |
|---|---|
| crates/pet/src/jsonrpc.rs | Implements refresh single-flight coordinator, option canonicalization, transient refresh execution + shared locator state sync, and adds coordinator/state-sync unit tests. |
| crates/pet-core/src/lib.rs | Introduces RefreshStatePersistence / RefreshStateSyncScope, extends Locator with refresh_state() + sync_refresh_state_from() and downcasting via as_any(). |
| crates/pet-conda/src/lib.rs | Implements refresh-state sync contract for Conda’s correctness-critical caches. |
| crates/pet-poetry/src/lib.rs | Implements refresh-state sync/merge behavior and makes configure/clear behavior consistent with transient refresh execution. |
| crates/pet-windows-store/src/lib.rs | Implements refresh-state sync for Windows Store locator cache + adds unit tests. |
| crates/pet-windows-registry/src/lib.rs | Implements refresh-state sync for Windows Registry locator cache + adds unit tests. |
| crates/pet-pipenv/src/lib.rs | Marks locator as ConfiguredOnly and ensures configure clears stale executable config when unset. |
| crates/pet-uv/src/lib.rs | Marks locator as ConfiguredOnly and ensures configure clears stale workspace directories when unset. |
| crates/pet-linux-global-python/src/lib.rs | Declares refresh-state persistence as SelfHydratingCache. |
| crates/pet-pyenv/src/lib.rs | Declares refresh-state persistence as SelfHydratingCache. |
| crates/pet-pyenv/tests/pyenv_test.rs | Adds coverage asserting PyEnv cache is self-hydrating without sync. |
| crates/pet/tests/jsonrpc_client.rs | Adds an integration-test JSONRPC client for exercising the real server. |
| crates/pet/tests/jsonrpc_server_test.rs | Adds real server integration tests for configure/refresh and concurrent refresh deduplication. |
Copilot's findings
- Files reviewed: 13/13 changed files
- Comments generated: 3
justschen
left a comment
There was a problem hiding this comment.
approving, but there are some CCR that maybe you want to look at down the line?
There was a problem hiding this comment.
Pull request overview
This PR updates PET’s JSONRPC refresh handling to deduplicate concurrent compatible refresh requests via a generation-aware single-flight coordinator, while preserving correctness by running refreshes on transient locator graphs and syncing required state back to shared locators.
Changes:
- Replaces the coarse global refresh mutex with a coordinator that joins compatible concurrent refresh requests and serializes incompatible ones.
- Runs refresh on transient locator graphs and syncs refresh-critical locator state back to the shared graph via a new
Locatorrefresh-state contract. - Adds unit + integration test coverage for coordinator behavior, refresh parameter canonicalization, and real server concurrent refresh deduplication.
Show a summary per file
| File | Description |
|---|---|
| crates/pet/src/jsonrpc.rs | Implements single-flight refresh coordinator, refresh param normalization/canonicalization, transient refresh execution + state sync, and adds coordinator/state-sync unit tests. |
| crates/pet-core/src/lib.rs | Extends Locator with refresh-state persistence/sync contract and as_any() downcasting support. |
| crates/pet-conda/src/lib.rs | Implements refresh-state sync for Conda caches and aligns configure behavior. |
| crates/pet-poetry/src/lib.rs | Implements refresh-state sync (full vs workspace merge), adds cache merge helpers, and adjusts configure/clear semantics with tests. |
| crates/pet-windows-store/src/lib.rs | Implements refresh-state sync for Windows Store locator and adds tests. |
| crates/pet-windows-registry/src/lib.rs | Implements refresh-state sync for Windows Registry locator and adds tests. |
| crates/pet-uv/src/lib.rs | Declares refresh-state persistence and ensures configure clears prior workspace dirs. |
| crates/pet-pipenv/src/lib.rs | Declares refresh-state persistence and makes configure clear executable when unset. |
| crates/pet-linux-global-python/src/lib.rs | Declares refresh-state persistence for Linux global python locator. |
| crates/pet-pyenv/src/lib.rs | Declares self-hydrating refresh-state persistence for PyEnv locator. |
| crates/pet-pyenv/tests/pyenv_test.rs | Adds coverage asserting PyEnv cache self-hydration without sync. |
| crates/pet/tests/jsonrpc_client.rs | Adds reusable JSONRPC integration test client for spawning/communicating with the real server. |
| crates/pet/tests/jsonrpc_server_test.rs | Adds real-server integration tests for configure/refresh and concurrent refresh deduplication behavior. |
Copilot's findings
- Files reviewed: 13/13 changed files
- Comments generated: 3
| fn force_complete_request(&self, key: &RefreshKey) { | ||
| let mut state = self | ||
| .state | ||
| .lock() | ||
| .expect("refresh coordinator mutex poisoned"); | ||
| match &*state { | ||
| RefreshCoordinatorState::Completing(active) if active.key == *key => { | ||
| *state = RefreshCoordinatorState::Idle; | ||
| self.changed.notify_all(); | ||
| } |
There was a problem hiding this comment.
force_complete_request matches on &*state and then assigns to *state inside the match arm. This should fail the borrow checker (immutable borrow of state for the match scrutinee conflicts with the later mutable assignment), and even if it compiled, it’s inconsistent with the other state transitions that use &mut *state.
Switch the match scrutinee to &mut *state (or use if let / mem::replace) so the coordinator can reliably transition to Idle during unwind paths.
| } | ||
|
|
||
| impl Drop for RefreshCompletionGuard<'_> { | ||
| fn drop(&mut self) { | ||
| if !self.completed { | ||
| self.coordinator.force_complete_request(&self.key); |
There was a problem hiding this comment.
RefreshCompletionGuard::drop forces the coordinator back to Idle on unwind, but it does not reply/error any request IDs still registered on the active refresh. If a panic occurs after requests have joined (e.g., during reply sending / state sync), those callers can time out because they never receive a JSONRPC response.
Consider draining any remaining request_ids when forcing completion and sending a generic error to each, so every registered request gets a terminal response even in panic paths.
| } | |
| impl Drop for RefreshCompletionGuard<'_> { | |
| fn drop(&mut self) { | |
| if !self.completed { | |
| self.coordinator.force_complete_request(&self.key); | |
| fn force_finish_with_error(&mut self, message: &str) { | |
| let request_ids = self.drain_request_ids(); | |
| self.coordinator.force_complete_request(&self.key); | |
| for request_id in request_ids { | |
| send_error(Some(request_id), -4, message.to_string()); | |
| } | |
| self.completed = true; | |
| } | |
| } | |
| impl Drop for RefreshCompletionGuard<'_> { | |
| fn drop(&mut self) { | |
| if !self.completed { | |
| self.force_finish_with_error("Refresh failed unexpectedly."); |
| // Refresh runs on a transient locator graph, so apply each locator's refresh-state | ||
| // contract back into the long-lived shared locator graph. | ||
| sync_refresh_locator_state( | ||
| context.locators.as_ref(), | ||
| refresh_locators.locators.as_ref(), | ||
| search_scope.as_ref(), | ||
| ); |
There was a problem hiding this comment.
execute_refresh always syncs transient locator state back into the shared locator graph, even if a concurrent configure has already advanced the configuration generation. That allows an older refresh to overwrite correctness-critical caches (e.g., Conda/Poetry discovery state) after a newer configure has completed.
To keep shared state generation-safe, consider either (a) having handle_configure wait for the refresh coordinator to become idle before mutating configuration and reconfiguring shared locators, or (b) re-checking the current generation immediately before sync_refresh_locator_state and skipping the sync when the generation no longer matches the refresh’s captured ConfigurationState.
| // Refresh runs on a transient locator graph, so apply each locator's refresh-state | |
| // contract back into the long-lived shared locator graph. | |
| sync_refresh_locator_state( | |
| context.locators.as_ref(), | |
| refresh_locators.locators.as_ref(), | |
| search_scope.as_ref(), | |
| ); | |
| // Refresh runs on a transient locator graph, so only apply each locator's | |
| // refresh-state contract back into the long-lived shared locator graph if | |
| // the configuration generation has not changed while this refresh was in | |
| // flight. Otherwise, a stale refresh could overwrite caches produced for a | |
| // newer configuration. | |
| let current_generation = context | |
| .configuration_state | |
| .read() | |
| .expect("configuration state rwlock poisoned") | |
| .generation; | |
| if current_generation == configuration_state.generation { | |
| sync_refresh_locator_state( | |
| context.locators.as_ref(), | |
| refresh_locators.locators.as_ref(), | |
| search_scope.as_ref(), | |
| ); | |
| } else { | |
| warn!( | |
| "Skipping refresh locator state sync because configuration generation advanced from {} to {} during refresh", | |
| configuration_state.generation, | |
| current_generation | |
| ); | |
| } |
This follow-up fixes the post-merge correctness gap left after #386. - make refresh notification emission generation-aware and atomic with the generation check - keep stale refresh state from syncing back into shared locators after a newer configure - replace the missing-env one-shot boolean with generation-stamped reservation semantics so newer generations can supersede stale reservations - add regression tests for stale-generation reporting and missing-env reservation behavior Fixes #390
This changes refresh handling so concurrent compatible refresh requests join a single physical refresh instead of rerunning the same work, while keeping locator state safe across configure and refresh boundaries.
searchPathsrequests dedupe correctly and malformed non-empty array params still failFixes #383